bayesian calibration
Bayesian Calibration and Model Assessment of Cell Migration Dynamics with Surrogate Model Integration
Schenk, Christina, Jiménez, Jacobo Ayensa, Romero, Ignacio
Computational models provide crucial insights into complex biological processes such as cancer evolution, but their mechanistic nature often makes them nonlinear and parameter-rich, complicating calibration. We systematically evaluate parameter probability distributions in cell migration models using Bayesian calibration across four complementary strategies: parametric and surrogate models, each with and without explicit model discrepancy. This approach enables joint analysis of parameter uncertainty, predictive performance, and interpretability. Applied to a real data experiment of glioblastoma progression in microfluidic devices, surrogate models achieve higher computational efficiency and predictive accuracy, whereas parametric models yield more reliable parameter estimates due to their mechanistic grounding. Incorporating model discrepancy exposes structural limitations, clarifying where model refinement is necessary. Together, these comparisons offer practical guidance for calibrating and improving computational models of complex biological systems.
Bayesian calibration of stochastic agent based model via random forest
Robertson, Connor, Safta, Cosmin, Collier, Nicholson, Ozik, Jonathan, Ray, Jaideep
Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high dimensional calibration can be computationally prohibitive. This paper presents a random forest based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June in 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results and their predictive performance is analyzed showing improved performance with a reduction in computation.
PINN surrogate of Li-ion battery models for parameter inference. Part II: Regularization and application of the pseudo-2D model
Hassanaly, Malik, Weddle, Peter J., King, Ryan N., De, Subhayan, Doostan, Alireza, Randall, Corey R., Dufek, Eric J., Colclasure, Andrew M., Smith, Kandler
Bayesian parameter inference is useful to improve Li-ion battery diagnostics and can help formulate battery aging models. However, it is computationally intensive and cannot be easily repeated for multiple cycles, multiple operating conditions, or multiple replicate cells. To reduce the computational cost of Bayesian calibration, numerical solvers for physics-based models can be replaced with faster surrogates. A physics-informed neural network (PINN) is developed as a surrogate for the pseudo-2D (P2D) battery model calibration. For the P2D surrogate, additional training regularization was needed as compared to the PINN single-particle model (SPM) developed in Part I. Both the PINN SPM and P2D surrogate models are exercised for parameter inference and compared to data obtained from a direct numerical solution of the governing equations. A parameter inference study highlights the ability to use these PINNs to calibrate scaling parameters for the cathode Li diffusion and the anode exchange current density. By realizing computational speed-ups of 2250x for the P2D model, as compared to using standard integrating methods, the PINN surrogates enable rapid state-of-health diagnostics. In the low-data availability scenario, the testing error was estimated to 2mV for the SPM surrogate and 10mV for the P2D surrogate which could be mitigated with additional data.
Learning Multi-Task Gaussian Process Over Heterogeneous Input Domains
Liu, Haitao, Wu, Kai, Ong, Yew-Soon, Bian, Chao, Jiang, Xiaomo, Wang, Xiaofang
Multi-task Gaussian process (MTGP) is a well-known non-parametric Bayesian model for learning correlated tasks effectively by transferring knowledge across tasks. But current MTGPs are usually limited to the multi-task scenario defined in the same input domain, leaving no space for tackling the heterogeneous case, i.e., the features of input domains vary over tasks. To this end, this paper presents a novel heterogeneous stochastic variational linear model of coregionalization (HSVLMC) model for simultaneously learning the tasks with varied input domains. Particularly, we develop the stochastic variational framework with Bayesian calibration that (i) takes into account the effect of dimensionality reduction raised by domain mappings in order to achieve effective input alignment; and (ii) employs a residual modeling strategy to leverage the inductive bias brought by prior domain mappings for better model inference. Finally, the superiority of the proposed model against existing LMC models has been extensively verified on diverse heterogeneous multi-task cases and a practical multi-fidelity steam turbine exhaust problem.
Bayesian Calibration of imperfect computer models using Physics-informed priors
Spitieris, Michail, Steinsland, Ingelin
In this work we introduce a computational efficient data-driven framework suitable for quantifying the uncertainty in physical parameters of computer models, represented by differential equations. We construct physics-informed priors for differential equations, which are multi-output Gaussian process (GP) priors that encode the model's structure in the covariance function. We extend this into a fully Bayesian framework which allows quantifying the uncertainty of physical parameters and model predictions. Since physical models are usually imperfect descriptions of the real process, we allow the model to deviate from the observed data by considering a discrepancy function. For inference Hamiltonian Monte Carlo (HMC) sampling is used. This work is motivated by the need for interpretable parameters for the hemodynamics of the heart for personal treatment of hypertension. The model used is the arterial Windkessel model, which represents the hemodynamics of the heart through differential equations with physically interpretable parameters of medical interest. As most physical models, the Windkessel model is an imperfect description of the real process. To demonstrate our approach we simulate noisy data from a more complex physical model with known mathematical connections to our modeling choice. We show that without accounting for discrepancy, the posterior of the physical parameters deviates from the true value while when accounting for discrepancy gives reasonable quantification of physical parameters uncertainty and reduces the uncertainty in subsequent model predictions.
BayCANN: Streamlining Bayesian Calibration with Artificial Neural Network Metamodeling
Jalal, Hawre, Alarid-Escudero, Fernando
Purpose: Bayesian calibration is theoretically superior to standard direct-search algorithm because it can reveal the full joint posterior distribution of the calibrated parameters. However, to date, Bayesian calibration has not been used often in health decision sciences due to practical and computational burdens. In this paper we propose to use artificial neural networks (ANN) as one solution to these limitations. Methods: Bayesian Calibration using Artificial Neural Networks (BayCANN) involves (1) training an ANN metamodel on a sample of model inputs and outputs, and (2) then calibrating the trained ANN metamodel instead of the full model in a probabilistic programming language to obtain the posterior joint distribution of the calibrated parameters. We demonstrate BayCANN by calibrating a natural history model of colorectal cancer to adenoma prevalence and cancer incidence data. In addition, we compare the efficiency and accuracy of BayCANN against performing a Bayesian calibration directly on the simulation model using an incremental mixture importance sampling (IMIS) algorithm. Results: BayCANN was generally more accurate than IMIS in recovering the "true" parameter values. The ratio of the absolute ANN deviation from the truth compared to IMIS for eight out of the nine calibrated parameters were less than one indicating that BayCANN was more accurate than IMIS. In addition, BayCANN took about 15 minutes total compared to the IMIS method which took 80 minutes. Conclusions: In our case study, BayCANN was more accurate than IMIS and was five-folds faster. Because BayCANN does not depend on the structure of the simulation model, it can be adapted to models of various levels of complexity with minor changes to its structure. We provide BayCANN's open-source implementation in R.
Fixed Inducing Points Online Bayesian Calibration for Computer Models with an Application to a Scale-Resolving CFD Simulation
Duan, Yu, Eaton, Matthew, Bluck, Michael
This paper proposes a novel fixed inducing points online Bayesian calibration (FIPO-BC) algorithm to efficiently learn the model parameters using a benchmark database. The standard Bayesian calibration (STD-BC) algorithm provides a statistical method to calibrate the parameters of computationally expensive models. However, the STD-BC algorithm scales very badly with the number of data points and lacks online learning capability. The proposed FIPO-BC algorithm greatly improves the computational efficiency and enables the online calibration by executing the calibration on a set of predefined inducing points. To demonstrate the procedure of the FIPO-BC algorithm, two tests are performed, finding the optimal value and exploring the posterior distribution of 1) the parameter in a simple function, and 2) the high-wave number damping factor in a scale-resolving turbulence model (SAS-SST). The results (such as the calibrated model parameter and its posterior distribution) of FIPO-BC with different inducing points are compared to those of STD-BC. It is found that FIPO-BC and STD-BC can provide very similar results, once the predefined set of inducing point in FIPO-BC is sufficiently fine. But, the FIPO-BC algorithm is at least ten times faster than the STD-BC algorithm. Meanwhile, the online feature of the FIPO-BC allows continuous updating of the calibration outputs and potentially reduces the workload on generating the database.
Variational Inference with Vine Copulas: An efficient Approach for Bayesian Computer Model Calibration
Kejzlar, Vojtech, Maiti, Tapabrata
The ever-growing access to high performance computing in scientific communities has enabled development of complex computer models in fields such as nuclear physics, climatology, and engineering that produce massive amounts of data. These models need real-time calibration with quantified uncertainties. Bayesian methodology combined with Gaussian process modeling has been heavily utilized for calibration of computer models due to its natural way to account for various sources of uncertainty; see Higdon et al. (2015), and King et al. (2019) for examples in nuclear physics, Sexton et al. (2012) and Pollard et al. (2016) for examples in climatology, and Lawrence et al. (2010), Plumlee et al. (2016) and Zhang et al. (2019) for applications in engineering, astrophysics, and medicine. The original framework for Bayesian calibration of computer models was developed by Kennedy and O'Hagan (2001) with extensions provided by Higdon et al. (2005, 2008); Bayarri et al. (2007); Plumlee (2017, 2019), and Gu and Wang (2018), to name a few. Despite its popularity, however, Bayesian calibration becomes infeasible in big-data scenarios with complex and many-parameter models because it relies on Markov chain Monte Carlo (MCMC) algorithms to approximate posterior densities. This text presents a scalable and statistically principled approach to Bayesian calibration of computer models. We offer an alternative approximation to posterior densities using variational Bayesian inference (VBI), which originated as a machine learning algorithm that approximates a target density through optimization. Statisticians and computer scientists (starting with Peterson and Anderson (1987); Jordan et al. (1999)) have been widely using variational techniques because they tend to be faster and easier to scale to massive datasets. Moreover, the recently published frequentist consistency of variational Bayes by Wang and Blei (2018) established VBI as a theoretically valid procedure.
Learning and Optimization with Bayesian Hybrid Models
Eugene, Elvis A., Gao, Xian, Dowling, Alexander W.
Bayesian hybrid models fuse physics-based insights with machine learning constructs to correct for systematic bias. In this paper, we compare Bayesian hybrid models against physics-based glass-box and Gaussian process black-box surrogate models. We consider ballistic firing as an illustrative case study for a Bayesian decision-making workflow. First, Bayesian calibration is performed to estimate model parameters. We then use the posterior distribution from Bayesian analysis to compute optimal firing conditions to hit a target via a single-stage stochastic program. The case study demonstrates the ability of Bayesian hybrid models to overcome systematic bias from missing physics with less data than the pure machine learning approach. Ultimately, we argue Bayesian hybrid models are an emerging paradigm for data-informed decision-making under parametric and epistemic uncertainty.
Variational Calibration of Computer Models
Marmin, Sébastien, Filippone, Maurizio
Bayesian calibration of black-box computer models offers an established framework to obtain a posterior distribution over model parameters. Traditional Bayesian calibration involves the emulation of the computer model and an additive model discrepancy term using Gaussian processes; inference is then carried out using MCMC. These choices pose computational and statistical challenges and limitations, which we overcome by proposing the use of approximate Deep Gaussian processes and variational inference techniques. The result is a practical and scalable framework for calibration, which obtains competitive performance compared to the state-of-the-art.